Object pose estimating and matching system using weight information

ABSTRACT

An object pose estimating and matching system is disclosed for estimating and matching the pose of an object highly accurately, by establishing suitable weighting coefficients, against images of an object captured under different conditions of pose, illumination, etc. A pose candidate determining unit determines pose candidates for an object. A comparative image generating unit generates comparative images close to an input image depending on the pose candidates, based on reference three-dimensional object models. A weighting coefficient converting unit determines a coordinate correspondence between standard three-dimensional weighting coefficients and the reference three-dimensional object models, using standard three-dimensional basic points and reference three-dimensional basic points, and converts the standard three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on the pose candidates. A weighted matching and pose selecting unit calculates weighted distance values or similarity degrees between the input image and the comparative images, using the two-dimensional weighting coefficients, and selects the comparative image whose distance value up to the object is the smallest or whose similarity degree with respect to the object is the greatest, thereby estimating and matching the pose of the object.

TECHNICAL FIELD

The present invention relates to an object pose estimating and matching system for estimating and matching the pose of an object by matching an input image of an object (including the face of a person) that has been captured under different conditions of pose, illumination, etc., against reference images and three-dimensional object models stored in a database (DB).

BACKGROUND ART

One example of a conventional object pose estimating and matching system is disclosed in Shimada et al., "Method of constructing a dictionary for personal identification independent of face orientation", IEICE TRANSACTIONS D-II, Vol. J78-D-II, No. 11, pages 1639-1649, 1995 (hereinafter referred to as "first prior art"). As shown in FIG. 1, the object pose estimating and matching system according to the first prior art has image input unit 10, normalizer 15, matching and pose selecting unit 41, and pose-specific reference image storage unit 85.

The conventional object pose estimating and matching system thus constructed operates as follows: Pose-specific reference image storage unit 85 stores at least one pose-specific reference image captured of one or more objects under one or various pose conditions. Each pose-specific reference image is generated from one image or an average of images captured for each pose. Image input unit 10 is implemented by a camera or the like, and stores a captured input image in a memory (not shown). Input images may be read from a recorded file or acquired through a network. Normalizer 15 aligns an input image using feature points extracted from the object, and generates a normalized image. In the illustrated system, normalizer 15 aligns an input image by detecting, as feature points, the positions of characteristic parts, e.g., an eye and a mouth. The pose-specific reference images are also normalized and stored. Normalized images often use features obtained by a feature extracting process. Matching and pose selecting unit 41 calculates distance values (or similarity degrees) between the normalized image and the pose-specific reference images of respective objects obtained from pose-specific reference image storage unit 85, and selects the reference image whose distance value up to the object is the smallest (whose similarity degree is the largest), thereby estimating an optimum pose. The distance values are calculated by using the normalized correlation or the Euclidean distance, for example. If an input image is matched against one object (one-to-one matching), then the minimum distance value is compared with a threshold value to determine whether the input image is the same as the object or not. If the one of a plurality of objects (reference images) which is closest to an input image is searched for (one-to-N matching), then the object which has the smallest of the minimum distance values determined up to the respective objects is extracted.
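To make the matching step concrete, the following minimal Python sketch (illustrative only, not part of the patent text; the array shapes and helper names are assumptions) computes the minimum Euclidean distance over pose-specific reference images and applies it to the one-to-one and one-to-N decisions described above.

```python
import numpy as np

def min_distance(normalized, pose_refs):
    # pose_refs: (num_poses, H, W) stack of pose-specific reference
    # images of one object; normalized: (H, W) normalized input image.
    d = ((pose_refs - normalized) ** 2).sum(axis=(1, 2))
    return float(d.min())  # distance of the best-matching pose

def one_to_one(normalized, pose_refs, threshold):
    # Same object if the minimum distance falls below the threshold.
    return min_distance(normalized, pose_refs) <= threshold

def one_to_n(normalized, refs_per_object):
    # Index of the object whose best pose lies closest to the input.
    return int(np.argmin([min_distance(normalized, r)
                          for r in refs_per_object]))
```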

Another example of a conventional object pose estimating and matching system is disclosed in JP-2003-58896A (hereinafter referred to as "second prior art"). As shown in FIG. 2, the conventional object pose estimating and matching system according to the second prior art has image input unit 10, comparative image generator 20, pose candidate determining unit 30, matching and pose selecting unit 41, and reference three-dimensional object model storage unit 55.

The conventional object pose estimating and matching system thus constructed operates as follows: Reference three-dimensional object model storage unit 55 registers therein reference three-dimensional object models of respective objects (three-dimensional shapes and object surface textures of the objects). Pose candidate determining unit 30 determines at least one pose candidate. Comparative image generator 20 generates a comparative image having illuminating conditions close to those of the input image, based on the reference three-dimensional object models obtained from reference three-dimensional object model storage unit 55. Matching and pose selecting unit 41 calculates distance values (or similarity degrees) between the input image and the comparative images, and selects the comparative image (pose candidate) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose.

Still another example of a conventional object matching system is disclosed in Guo et al., "Human face recognition based on spatially weighted Hausdorff distance", Pattern Recognition Letters, Vol. 24, pages 499-507, 2003 (hereinafter referred to as "third prior art"). As shown in FIG. 3, the conventional object matching system according to the third prior art has image input unit 10, normalizer 15, weighted matching unit 45, reference image storage unit 89, and weighting coefficient storage unit 99.

The conventional object matching system thus constructed operates as follows: Image input unit 10 and normalizer 15 operate in the same manner as the components denoted by the identical reference numerals according to the first prior art. Reference image storage unit 89 stores at least one reference image for each object. Weighting coefficient storage unit 99 stores weighting coefficients for pixels (or features) to be used for comparing a normalized image and reference images. Weighted matching unit 45 calculates distance values (or similarity degrees) between the normalized image and the reference images of respective objects obtained from reference image storage unit 89, and selects the reference image whose distance value is the smallest, thereby matching the input image. If the Euclidean distance, for example, is used for calculating the distances, then a weighted Euclidean distance is calculated according to D=Σ_(r)w(r){x(r)−m(r)}², where x(r) represents the normalized image, m(r) the reference image, and w(r) the weighting coefficient (r represents a pixel or feature index).
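As a minimal illustration of this formula (a sketch only; the function name is hypothetical), the weighted Euclidean distance can be written as:

```python
import numpy as np

def weighted_distance(x, m, w):
    # D = sum_r w(r) * (x(r) - m(r))**2, summed over pixels/features r.
    return float((w * (x - m) ** 2).sum())
```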

The conventional object matching systems described above have the following problems:

According to the first prior art and the second prior art, though a pose can be estimated and matched, the accuracy with which a pose is estimated and matched is lowered if a large local difference is developed between an input image and reference images or comparative images from the DB due to local deformations of the object and different image capturing conditions.

The reasons for the above problem are that when the object is deformed, even if the pose of the object is generally aligned with those of the reference images or comparative images, the object has a local area not aligned with the reference images or comparative images, resulting in different pixel values (or features) in the local area. Even when the object is not deformed and has aligned local areas, according to the first prior art, there is developed a local area having largely different pixel values if the input image and the reference images are captured under different conditions. For example, if the input image and the reference images are captured under different illuminating conditions, then shadows are produced in different areas. According to the second prior art, even if a comparative image is generated which is closest to an input image, they have different local areas because of observation errors in three-dimensional object measurement and the simplified process of generating comparative images.

The third prior art is problematic in that the matching accuracy is reduced if object poses and illuminating conditions in an input image and reference images are different from each other.

The reasons for the problem of the third prior art are that weighting coefficients are established for areas of an object, and if pose conditions are different, then the object has a misaligned area, making it impossible to perform proper weighted matching. Furthermore, when illuminating conditions are different, an area that is important for matching often changes. However, since the weighting coefficient remains the same, appropriate weighted matching cannot be performed.

DISCLOSURE OF THE INVENTION

It is an object of the present invention to provide an object pose estimating and matching system for estimating and matching the pose of an object highly accurately by establishing suitable weighting coefficients depending on the pose even if a large local difference is developed in the comparison between an input image and images from a DB.

Another object of the present invention is to provide an object pose estimating and matching system for estimating and matching the pose of an object by establishing suitable weighting coefficients depending on variations that may occur as object deformations and illuminating condition variations.

According to a first aspect of the present invention, an object pose estimating and matching system comprises:

reference three-dimensional object model storage means for storing, in advance, reference three-dimensional object models of objects;

reference three-dimensional weighting coefficient storage means for storing, in advance, reference three-dimensional weighting coefficients corresponding to said reference three-dimensional object models;

pose candidate determining means for determining pose candidates for an object;

comparative image generating means for generating comparative images close to an input image depending on said pose candidates, based on said reference three-dimensional object models;

weighting coefficient converting means for converting said reference three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on said pose candidates, using said reference three-dimensional object models; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said input image and said comparative images, using said two-dimensional weighting coefficients, and selecting one of the comparative images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Three-dimensional weighting coefficients corresponding to three-dimensional object models are generated and stored. For matching the input image, comparative images are generated from reference three-dimensional object models depending on pose candidates, and the three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to a second aspect of the present invention, an object pose estimating and matching system comprises:

reference three-dimensional object model storage means for storing, in advance, reference three-dimensional object models of objects;

standard three-dimensional weighting coefficient storage means for storing, in advance, standard three-dimensional weighting coefficients;

reference three-dimensional basic point storage means for storing, in advance, reference three-dimensional basic points corresponding to said reference three-dimensional object models;

standard three-dimensional basic point storage means for storing, in advance, standard three-dimensional basic points corresponding to standard three-dimensional object models;

pose candidate determining means for determining pose candidates for an object;

comparative image generating means for generating comparative images close to an input image depending on said pose candidates, based on said reference three-dimensional object models;

weighting coefficient converting means for determining a coordinate correspondence between said standard three-dimensional weighting coefficients and said reference three-dimensional object models, using said standard three-dimensional basic points and said reference three-dimensional basic points, and converting said standard three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on said pose candidates; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said input image and said comparative images, using said two-dimensional weighting coefficients, and selecting one of the comparative images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Comparative images are generated from reference three-dimensional object models depending on pose candidates, and three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to a third aspect of the present invention, an object pose estimating and matching system comprises:

reference three-dimensional object model storage means for storing, in advance, reference three-dimensional object models of objects;

variation-specific reference three-dimensional weighting coefficient storage means for storing, in advance, reference three-dimensional weighting coefficients corresponding to said reference three-dimensional object models and image variations;

pose candidate determining means for determining pose candidates for an object;

variation estimating means for determining a correspondence between an area of a three-dimensional object model and an input image, using said pose candidates and said reference three-dimensional object models, and estimating a variation based on image information of a given area of said input image;

comparative image generating means for generating comparative images close to an input image depending on said pose candidates, based on said reference three-dimensional object models;

weighting coefficient converting means for converting said reference three-dimensional weighting coefficients corresponding to the estimated variation into two-dimensional weighting coefficients depending on said pose candidates, using said reference three-dimensional object models; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said input image and said comparative images, using said two-dimensional weighting coefficients, and selecting one of the comparative images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Comparative images are generated from reference three-dimensional object models depending on pose candidates, and three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses. Moreover, variation-specific three-dimensional weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the input image, and a corresponding three-dimensional weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur as object deformations and illuminating condition variations.

According to a fourth aspect of the present invention, an object pose estimating and matching system comprises:

reference three-dimensional object model storage means for storing, in advance, reference three-dimensional object models of objects;

variation-specific standard three-dimensional weighting coefficient storage means for storing, in advance, standard three-dimensional weighting coefficients corresponding to image variations;

reference three-dimensional basic point storage means for storing, in advance, reference three-dimensional basic points corresponding to said reference three-dimensional object models;

standard three-dimensional basic point storage means for storing, in advance, standard three-dimensional basic points corresponding to standard three-dimensional object models;

pose candidate determining means for determining pose candidates for an object;

variation estimating means for determining a correspondence between an area of a three-dimensional object model and an input image, using said pose candidates and said reference three-dimensional object models, and estimating a variation based on image information of a given area of said input image;

comparative image generating means for generating comparative images close to an input image depending on said pose candidates, based on said reference three-dimensional object models;

weighting coefficient converting means for determining a coordinate correspondence between said standard three-dimensional weighting coefficients corresponding to the estimated variation and said reference three-dimensional object models, using said standard three-dimensional basic points and said reference three-dimensional basic points, and converting said standard three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on said pose candidates; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said input image and said comparative images, using said two-dimensional weighting coefficients, and selecting one of the comparative images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Comparative images are generated from reference three-dimensional object models depending on pose candidates, and three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses. Moreover, variation-specific three-dimensional weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the input image, and a corresponding three-dimensional weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur as object deformations and illuminating condition variations.

According to a fifth aspect of the present invention, an object pose estimating and matching system comprises:

pose-specific reference image storage means for storing, in advance, pose-specific reference images of an object;

pose-specific reference weighting coefficient storage means for storing, in advance, pose-specific reference weighting coefficients corresponding to said reference images;

normalizing means for normalizing an input image to generate a normalized image; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said normalized image and said reference images, using said pose-specific weighting coefficients, and selecting one of the reference images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to a sixth aspect of the present invention, an object pose estimating and matching system comprises:

pose-specific reference image storage means for storing, in advance, pose-specific reference images of an object;

pose-specific standard weighting coefficient storage means for storing, in advance, pose-specific standard weighting coefficients;

normalizing means for normalizing an input image to generate a normalized image; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said normalized image and said reference images, using said pose-specific weighting coefficients, and selecting one of the reference images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to a seventh aspect of the present invention, an object pose estimating and matching system comprises:

pose-specific reference image storage means for storing, in advance, pose-specific reference images of an object;

pose- and variation-specific reference weighting coefficient storage means for storing, in advance, pose- and variation-specific reference weighting coefficients corresponding to said reference images and image variations;

standard three-dimensional object model storage means for storing, in advance, standard three-dimensional object models;

normalizing means for normalizing an input image to generate a normalized image;

variation estimating means for determining a correspondence between an area of a three-dimensional object model and the normalized image, using pose information of said reference images and said standard three-dimensional object models, and estimating a variation based on image information of a given area of said normalized image; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said normalized image and said reference images, using the pose information of said reference images and said pose- and variation-specific weighting coefficients corresponding to the estimated variation, and selecting one of the reference images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses. Moreover, pose- and variation-specific weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the normalized image, and a corresponding weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur as object deformations and illuminating condition variations.

According to an eighth aspect of the present invention, an object pose estimating and matching system comprises:

pose-specific reference image storage means for storing, in advance, pose-specific reference images of an object;

pose- and variation-specific standard weighting coefficient storage means for storing, in advance, pose- and variation-specific standard weighting coefficients corresponding to image variations;

standard three-dimensional object model storage means for storing, in advance, standard three-dimensional object models;

normalizing means for normalizing an input image to generate a normalized image;

variation estimating means for determining a correspondence between an area of a three-dimensional object model and the normalized image, using pose information of said reference images and said standard three-dimensional object models, and estimating a variation based on image information of a given area of said normalized image; and

weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said normalized image and said reference images, using the pose information of said reference images and said pose- and variation-specific weighting coefficients corresponding to the estimated variation, and selecting one of the reference images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.

Weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses. Moreover, pose- and variation-specific weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the normalized image, and a corresponding weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur as object deformations and illuminating condition variations.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of an arrangement according to a first prior art;

FIG. 2 is a block diagram of an arrangement according to a second prior art;

FIG. 3 is a block diagram of an arrangement according to a third prior art;

FIG. 4 is a block diagram of an arrangement of an object pose estimating and matching system according to a first embodiment of the present invention;

FIG. 5 is a flowchart of an operation sequence (pose estimation) of the first embodiment;

FIG. 6 is a flowchart of an operation sequence (one-to-one matching) of the first embodiment;

FIG. 7 is a flowchart of an operation sequence (one-to-N matching) of the first embodiment;

FIG. 8 is a flowchart of an operation sequence (registration) of the first embodiment;

FIG. 9 is a diagram showing a specific example of coordinates of a three-dimensional object model according to the first embodiment;

FIG. 10 is a diagram showing a specific example of reference three-dimensional object models according to the first embodiment;

FIG. 11 is a diagram showing a specific example of reference three-dimensional weighting coefficients according to the first embodiment;

FIG. 12 is a diagram showing a specific example of an input image according to the first embodiment;

FIG. 13 is a diagram showing a specific example of comparative images according to the first embodiment;

FIG. 14 is a diagram showing a specific example of two-dimensional weighting coefficients according to the first embodiment;

FIG. 15 is a diagram showing a specific example of learning images according to the first embodiment;

FIG. 16 is a diagram showing a specific example of comparative images according to the first embodiment;

FIG. 17 is a block diagram of an arrangement of an object pose estimating and matching system according to a second embodiment of the present invention;

FIG. 18 is a flowchart of an operation sequence (pose estimation) of the second embodiment;

FIG. 19 is a diagram showing a specific example of a standard three-dimensional weighting coefficient according to the second embodiment;

FIG. 20 is a diagram showing a specific example of standard three-dimensional basic points according to the second embodiment;

FIG. 21 is a diagram showing a specific example of reference three-dimensional basic points according to the second embodiment;

FIG. 22 is a diagram showing a specific example of two-dimensional weighting coefficients according to the second embodiment;

FIG. 23 is a block diagram of an arrangement of an object pose estimating and matching system according to a third embodiment of the present invention;

FIG. 24 is a flowchart of an operation sequence (pose estimation) of the third embodiment;

FIG. 25 is a flowchart of an operation sequence (registration) of the third embodiment;

FIG. 26 is a diagram showing a specific example of variations (illuminating conditions) according to the third embodiment;

FIG. 27 is a diagram showing a specific example of variation-specific reference three-dimensional weighting coefficients according to the third embodiment;

FIG. 28 is a diagram showing a specific example of an input image according to the third embodiment;

FIG. 29 is a diagram showing a specific example of comparative images according to the third embodiment;

FIG. 30 is a diagram showing a specific example of two-dimensional weighting coefficients according to the third embodiment;

FIG. 31 is a block diagram of an arrangement of an object pose estimating and matching system according to a fourth embodiment of the present invention;

FIG. 32 is a flowchart of an operation sequence (pose estimation) of the fourth embodiment;

FIG. 33 is a diagram showing a specific example of variation-specific reference three-dimensional weighting coefficients according to the fourth embodiment;

FIG. 34 is a diagram showing a specific example of two-dimensional weighting coefficients according to the fourth embodiment;

FIG. 35 is a block diagram of an arrangement of an object pose estimating and matching system according to a fifth embodiment of the present invention;

FIG. 36 is a flowchart of an operation sequence (pose estimation) of the fifth embodiment;

FIG. 37 is a flowchart of an operation sequence (registration) of the fifth embodiment;

FIG. 38 is a diagram showing a specific example of pose-specific reference images according to the fifth embodiment;

FIG. 39 is a diagram showing a specific example of pose-specific reference weighting coefficients according to the fifth embodiment;

FIG. 40 is a diagram showing a specific example of a normalized image according to the fifth embodiment;

FIG. 41 is a block diagram of an arrangement of an object pose estimating and matching system according to a sixth embodiment of the present invention;

FIG. 42 is a flowchart of an operation sequence (pose estimation) of the sixth embodiment;

FIG. 43 is a diagram showing a specific example of pose-specific standard weighting coefficients according to the sixth embodiment;

FIG. 44 is a block diagram of an arrangement of an object pose estimating and matching system according to a seventh embodiment of the present invention;

FIG. 45 is a flowchart of an operation sequence (pose estimation) of the seventh embodiment;

FIG. 46 is a flowchart of an operation sequence (registration) of the seventh embodiment;

FIG. 47 is a diagram showing a specific example of pose- and variation-specific reference weighting coefficients according to the seventh embodiment;

FIG. 48 is a block diagram of an arrangement of an object pose estimating and matching system according to an eighth embodiment of the present invention;

FIG. 49 is a flowchart of an operation sequence (pose estimation) of the eighth embodiment; and

FIG. 50 is a diagram showing a specific example of pose- and variation-specific standard weighting coefficients according to the eighth embodiment.

BEST MODE FOR CARRYING OUT THE INVENTION

1st Embodiment

Referring to FIG. 4, an object pose estimating and matching system according to a first embodiment of the present invention comprises image input unit 10, comparative image generator 20, pose candidate determining unit 30, weighted matching and pose selecting unit 40, weighting coefficient converter 60, reference three-dimensional object model storage unit 55, reference three-dimensional weighting coefficient storage unit 65, and registration unit 2. Registration unit 2 comprises three-dimensional object model register 50, matching and pose selecting unit 41, and three-dimensional weighting coefficient generator 62.

Image input unit 10, comparative image generator 20, pose candidate determining unit 30, reference three-dimensional object model storage unit 55, and matching and pose selecting unit 41 operate in the same manner as the components denoted by the identical reference numerals according to the second prior art shown in FIG. 2.

Reference three-dimensional weighting coefficient storage unit 65 stores reference three-dimensional weighting coefficients corresponding to the reference three-dimensional object models of respective objects.

Weighting coefficient converter 60 converts reference three-dimensional weighting coefficients obtained from reference three-dimensional weighting coefficient storage unit 65 into two-dimensional weighting coefficients depending on the pose candidates obtained from pose candidate determining unit 30, using the reference three-dimensional object models obtained from reference three-dimensional object model storage unit 55.

Weighted matching and pose selecting unit 40 calculates weighted distance values (or similarity degrees) between the input image obtained from image input unit 10 and the comparative images depending on respective pose candidates obtained from comparative image generator 20, using the two-dimensional weighting coefficients obtained from weighting coefficient converter 60, and selects a comparative image (pose candidate) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose.

For matching the input image against one object (one-to-one matching), as with the first prior art, the minimum distance value is further compared with a threshold value to determine whether the input image is the same as the object or not. For searching a plurality of objects for an object that is closest to the input image (one-to-N matching), the object whose distance value is the smallest among the minimum distance values determined up to the respective objects is extracted.

Three-dimensional object model register 50 registers reference three-dimensional object models in reference three-dimensional object model storage unit 55.

Three-dimensional weighting coefficient generator 62 generates reference three-dimensional weighting coefficients by learning the degree of importance, in matching, of each pixel on the three-dimensional model, using the comparative image of the optimum pose obtained from matching and pose selecting unit 41 and the input image; the learning is based on the pixel correspondence, determined by the optimum pose, between the two-dimensional image and the reference three-dimensional object models obtained from reference three-dimensional object model storage unit 55. The generator then registers the generated reference three-dimensional weighting coefficients in reference three-dimensional weighting coefficient storage unit 65.

Overall operation of the present embodiment for pose estimation will be described in detail below with reference to FIG. 4 and a flowchart shown in FIG. 5.

First, an input image of a model (object) is obtained by image input unit 10 (step 100 in FIG. 5). Then, pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Then, comparative image generator 20 generates comparative images having illuminating conditions close to those of the input image, with respect to the respective pose candidates, based on reference three-dimensional object models C_(k) obtained from reference three-dimensional object model storage unit 55 (step 120). Weighting coefficient converter 60 converts reference three-dimensional weighting coefficients obtained from reference three-dimensional weighting coefficient storage unit 65 into two-dimensional weighting coefficients depending on the pose candidates, using the reference three-dimensional object models (step 130). Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(kj) (or similarity degrees) between the input image and the comparative images, using the two-dimensional weighting coefficients (step 140), and selects the comparative image (pose candidate) whose distance value up to the model (object) for the input image is the smallest, thereby estimating an optimum pose (step 150).
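The loop over pose candidates (steps 110 through 150) can be sketched as follows. This is an illustrative outline only: make_comparative and project_weights are hypothetical stand-ins, passed in as callables, for comparative image generator 20 (step 120) and weighting coefficient converter 60 (step 130).

```python
import numpy as np

def estimate_pose(input_image, pose_candidates, make_comparative, project_weights):
    """make_comparative(pose) -> comparative image G_kj  (step 120);
    project_weights(pose)  -> 2-D weight map W_kj        (step 130)."""
    best_pose, best_dist = None, np.inf
    for pose in pose_candidates:                            # step 110
        g = make_comparative(pose)                          # step 120
        w = project_weights(pose)                           # step 130
        d = float((w * (input_image - g) ** 2).sum())       # step 140
        if d < best_dist:                                   # step 150
            best_pose, best_dist = pose, d
    return best_pose, best_dist
```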

In the above flowchart, the pose candidate whose distance value is the smallest is selected from the predetermined pose candidate group. However, control may return to pose candidate determining unit 30 to search for the pose candidate whose distance value is the smallest by successively changing the pose candidates.

Overall operation of the present embodiment for one-to-one matching will be described in detail below with reference to FIG. 4 and a flowchart shown in FIG. 6.

Steps 100 through 150 shown in FIG. 6 are identical to steps 100 through 150 shown in FIG. 5. Finally, weighted matching and pose selecting unit 40 compares the minimum distance value with the threshold value to determine whether the input image is the same as the object or not (step 160).

Overall operation of the present embodiment for one-to-N matching will be described in detail below with reference to FIG. 4 and a flowchart shown in FIG. 7.

First, image input unit 10 produces an input image of a model (object) (step 100 in FIG. 7). Then, weighted matching and pose selecting unit 40 sets a model number k=1 (step 170). Thereafter, steps identical to steps 110 through 150 for pose estimation shown in FIG. 5 are executed for each model C_(k), determining a minimum distance value according to an optimum pose for each model C_(k). Then, the model number k is incremented by 1 (step 171). If k is equal to or smaller than the number of models (step 172), then control goes back to step 110 for calculating a minimum distance value for the next model. Finally, the model C_(k) having the smallest minimum distance value is determined as the result of the matching process (step 175).
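A compact sketch of this one-to-N flow, reusing the estimate_pose sketch above (illustrative only; the per-model callables are assumptions, not part of the patent):

```python
import numpy as np

def one_to_n_match(input_image, per_model, pose_candidates):
    """per_model: one (make_comparative, project_weights) pair of
    callables per registered model C_k, as in the earlier sketch."""
    minima = [estimate_pose(input_image, pose_candidates, mk, pw)[1]
              for mk, pw in per_model]        # steps 110-150 per model
    return int(np.argmin(minima))             # best model (step 175)
```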

Overall operation of the present embodiment for registration will be described in detail below with reference to FIG. 4 and a flowchart shown in FIG. 8.

First, three-dimensional object model register 50 registers reference three-dimensional object models of objects C_(k) in reference three-dimensional object model storage unit 55 (step 300 in FIG. 8). Then, three-dimensional weighting coefficient generator 62 first sets an image number h=1 (step 210) and then enters a learning image having the image number h from image input unit 10 (step 200), for learning reference three-dimensional weighting coefficients using the learning images and the reference three-dimensional object models. Then, pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Then, comparative image generator 20 generates comparative images having illuminating conditions close to those of the input image, with respect to the respective pose candidates, based on the reference three-dimensional object models C_(k) obtained from reference three-dimensional object model storage unit 55 (step 120). Matching and pose selecting unit 41 calculates distance values D_(kj)′ (or similarity degrees) between the input image and the comparative images (step 141), and selects the comparative image (pose candidate) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose (step 150). Then, three-dimensional weighting coefficient generator 62 increments the image number h by 1 (step 211). If the image number h is equal to or smaller than the number N of learning images (step 212), then control goes back to step 200 for determining a comparative image having an optimum pose which corresponds to the next learning image. If the image number h is greater than the number N of learning images, then three-dimensional weighting coefficient generator 62 generates reference three-dimensional weighting coefficients by learning the degree of importance, in matching, of each pixel on the three-dimensional model, using the comparative images of the optimum poses which correspond to all the learning images; the learning is based on the pixel correspondence, determined by each optimum pose, between the two-dimensional image and the reference three-dimensional object models (step 220). Finally, three-dimensional weighting coefficient generator 62 registers the generated reference three-dimensional weighting coefficients in reference three-dimensional weighting coefficient storage unit 65 (step 230).

Advantages of the first embodiment will be described below.

According to the present embodiment, three-dimensional weighting coefficients corresponding to three-dimensional object models are generated and stored. For matching the input image, comparative images are generated from reference three-dimensional object models depending on pose candidates, and the three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to the present embodiment, furthermore, since only one three-dimensional weighting coefficient is used for all poses, an appropriate three-dimensional weighting coefficient depending on a desired pose can be established in a smaller storage capacity than if two-dimensional weighting coefficients were to be held for the respective poses.

According to the present embodiment, furthermore, because the degree of importance in matching of each pixel is learned on the three-dimensional model, an appropriate three-dimensional weighting coefficient depending on a desired pose can be determined with fewer learning images than would be needed to cover all poses.

A specific example of operation of the first embodiment will be described below. In the specific example to be described below, the face of a person will be described as an example of an object. However, the first embodiment is also applicable to other objects.

As shown in FIG. 10, reference three-dimensional object model storage unit 55 stores reference three-dimensional object models (three-dimensional shapes and textures) of objects C_(k). Three-dimensional object models can be generated by, for example, a three-dimensional shape measuring apparatus disclosed in JP-2001-12925A or an apparatus for restoring a three-dimensional shape from a plurality of images captured by a number of cameras, disclosed in JP-H09-91436A.

As shown in FIG. 9, a three-dimensional object model has information representing a shape P_(Q)(x,y,z) and a texture T_(Q)(R,G,B) in a three-dimensional space (x,y,z) of an object surface. Q indicates an index of a point on the object surface. For example, the index Q corresponds to the coordinates of a point Q(s,t) projected from a point on the object surface onto a spherical body having at its center the center of gravity of the object, along a line from the center of gravity of the object. For matching purposes, the three-dimensional object models are used to generate learning CG images under various illuminating conditions according to computer graphics, and the learning CG images are analyzed for their principal components, thereby determining a basic image group.
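One plausible realization of this spherical parameterization is sketched below; the patent does not fix the exact angle convention, so the choice of azimuth and polar angle here is an assumption.

```python
import numpy as np

def sphere_coords(points):
    """points: (N, 3) array of surface points P_Q(x, y, z); returns
    (N, 2) spherical coordinates (s, t) about the center of gravity."""
    c = points.mean(axis=0)                      # center of gravity
    v = points - c
    r = np.linalg.norm(v, axis=1)                # assumed nonzero
    s = np.arctan2(v[:, 1], v[:, 0])             # azimuth in [-pi, pi]
    t = np.arccos(np.clip(v[:, 2] / r, -1.0, 1.0))  # polar angle
    return np.stack([s, t], axis=1)
```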

As shown in FIG. 11, reference three-dimensional weighting coefficient storage unit 65 stores reference three-dimensional weighting coefficients V_(Q)^(k) for objects. For example, the reference three-dimensional weighting coefficient has a value of V_(Q)^(k)=1 for a black area, a value of V_(Q)^(k)=0 for a white area, and a value in the range of 0<V_(Q)^(k)<1 for a gray area.

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 12 is obtained by image input unit 10 (step 100 shown in FIG. 5). Pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110).

The pose candidate group may be preset irrespective of the input image. However, reference points such as eyes, a nose, and a mouth may manually or automatically be extracted from the input image and the three-dimensional models, and an appropriate pose may be estimated according to a process for calculating the position and orientation of an object as disclosed in JP-2001-283229A. It is efficient to generate a pose candidate group in the vicinity of such an estimated pose.

Comparative image generator 20 generates comparative images G_(1j)(r) having illuminating conditions close to those of the input image, with respect to the respective pose candidates e_(j), based on the reference three-dimensional object model C₁ (step 120).

A comparative image having illuminating conditions close to those of the input image is generated as follows: The basic image group that has been determined in advance is coordinate-transformed based on each pose candidate, and coefficients of the linear sum of the coordinate-transformed basic images are determined according to the least-square method so that the linear sum will be close to the input image. An example of comparative images generated with respect to the object C₁ is shown in FIG. 13.
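The least-squares fit of the linear sum can be sketched as follows (illustrative only; the basis images are assumed to be already coordinate-transformed for the pose candidate):

```python
import numpy as np

def fit_illumination(basis_images, input_image):
    """basis_images: (n_basis, H, W) coordinate-transformed basic image
    group; returns the fitted comparative image of shape (H, W)."""
    B = basis_images.reshape(len(basis_images), -1).T   # (H*W, n_basis)
    y = input_image.ravel()
    coeffs, *_ = np.linalg.lstsq(B, y, rcond=None)      # least squares
    return (B @ coeffs).reshape(input_image.shape)
```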

Weighting coefficient converter 60 converts the reference three-dimensional weighting coefficients V_(Q)¹ obtained from reference three-dimensional weighting coefficient storage unit 65 into two-dimensional weighting coefficients W_(1j)(r) depending on the pose candidates e_(j), using the reference three-dimensional object model C₁ (step 130). An example of two-dimensional weighting coefficients generated with respect to the object C₁ is shown in FIG. 14.

Weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) between the input image I(r) and the comparative images G_(1j)(r), using the two-dimensional weighting coefficients W_(1j)(r) (step 140). For example, if the Euclidean distance is used, the weighted distance is calculated according to D_(kj)=Σ_(r)W_(kj)(r){I(r)−G_(kj)(r)}², and if a similarity degree S_(kj) is used, it is calculated according to S_(kj)=exp(−D_(kj)).

Finally, weighted matching and pose selecting unit 40 selects the comparative image (pose candidate) whose distance value up to the model C₁ is the smallest, thereby estimating an optimum pose (step 150). For the comparative images shown in FIG. 13, for example, the pose e₃ is selected as the comparative image whose distance value is the smallest.

A specific example of operation of the first embodiment for registration will be described below, taking the registration of the reference three-dimensional object model of the object C₁ as an example.

First, three-dimensional object model register 50 registers the reference three-dimensional object model of the object C₁ in reference three-dimensional object model storage unit 55 (step 300 in FIG. 8). It is assumed that three images as shown in FIG. 15 are obtained as learning images (learning images are images captured under various pose conditions of the object C₁).

Then, three-dimensional weighting coefficient generator 62 first sets an image number h=1 (step 210) and then enters a learning image I^(h)(r) having the image number h from image input unit 10 (step 200), for learning reference three-dimensional weighting coefficients using the learning image and the reference three-dimensional object models.

Then, pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Comparative image generator 20 generates comparative images G_(1j)^(h)(r) having illuminating conditions close to those of the input image, with respect to the respective pose candidates e_(j), based on the reference three-dimensional object model C₁ obtained from reference three-dimensional object model storage unit 55 (step 120).

Matching and pose selecting unit 41 calculates distance values D_(1j)^(h)′ (or similarity degrees) between the input image I^(h)(r) and the comparative images G_(1j)^(h)(r) (step 141). For example, if the Euclidean distance is used, then the distance values are calculated according to D_(kj)^(h)′=Σ_(r){I^(h)(r)−G_(kj)^(h)(r)}².

Matching and pose selecting unit 41 selects the comparative image (pose candidate) whose distance value up to the model (object) is the smallest, thereby determining an optimum pose (step 150). Then, three-dimensional weighting coefficient generator 62 increments the image number h by 1 (step 211). If the image number h is equal to or smaller than the number N=3 of learning images (step 212), then control goes back to step 200 for determining a comparative image having an optimum pose which corresponds to the next learning image. An example of comparative images having optimum poses determined so as to correspond to the respective learning images I^(h)(r) shown in FIG. 15 is illustrated in FIG. 16.

If the image number h is greater than the number N=3 of learning images, then three-dimensional weighting coefficient generator 62 generates reference three-dimensional weighting coefficients by learning the degree of importance, in matching, of each pixel on the three-dimensional model, using the comparative images G_(1j)^(h)(r) of the optimum poses which correspond to all the learning images I^(h)(r); the learning is based on the pixel correspondence, determined by each optimum pose, between the two-dimensional image and the reference three-dimensional object model (step 220). For example, on the assumption that an area where the error between the learning images and the comparative images is small is an area that is important for matching, a weighting coefficient is defined as the reciprocal of the average error. The error between the learning images I^(h)(r) and the comparative images G_(kj)^(h)(r) as two-dimensional images is calculated according to d_(kj)^(h)(r)=|I^(h)(r)−G_(kj)^(h)(r)|.

If the relationship between coordinates (s,t) on a three-dimensional object model and the coordinate r on a two-dimensional image at the time a comparative image is generated from the three-dimensional object model based on the pose e_(j) is represented by r=F(s,t), then the coordinates of the point on the three-dimensional model which corresponds to the pixel r on the two-dimensional image are determined as {s,t}=F⁻¹(r) by an inverse transform. If the error d_(kj)^(h)(r) of each pixel is mapped onto the point {s,t} on the three-dimensional model by the inverse transform and the average error over all the learning images is represented by E_(Q)^(k), then the three-dimensional weighting coefficients are calculated according to V_(Q)^(k)=A/E_(Q)^(k) (A represents a normalizing coefficient). The three-dimensional weighting coefficients V_(Q)^(k) are converted into two-dimensional weighting coefficients W_(kj)(r) according to the transform r=F(s,t) obtained for each of the pose candidates e_(j). Finally, the three-dimensional weighting coefficient V_(Q)¹ of the object C₁ is registered in reference three-dimensional weighting coefficient storage unit 65 (step 230).
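The weight-learning rule V_(Q)^(k)=A/E_(Q)^(k) can be sketched as below; this is an illustration under the assumption that the inverse mapping F⁻¹ has been precomputed as a pixel-to-model-point index lookup for each learning image.

```python
import numpy as np

def learn_weights(errors_2d, pixel_to_q, num_points, A=1.0):
    """errors_2d: flat per-pixel error images d(r) = |I - G|, one per
    learning image; pixel_to_q: for each image, the model-point index Q
    of each pixel (the precomputed F^-1). Returns V_Q = A / E_Q."""
    err_sum = np.zeros(num_points)
    count = np.zeros(num_points)
    for err, q_idx in zip(errors_2d, pixel_to_q):
        np.add.at(err_sum, q_idx, err)      # map errors onto the model
        np.add.at(count, q_idx, 1.0)
    E = err_sum / np.maximum(count, 1.0)    # average error E_Q
    eps = 1e-6                              # guard against division by zero
    return A / np.maximum(E, eps)           # V_Q = A / E_Q
```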

2nd Embodiment

Referring to FIG. 17, an object pose estimating and matching system according to a second embodiment of the present invention comprises image input unit 10, comparative image generator 20, pose candidate determining unit 30, weighted matching and pose selecting unit 40, weighting coefficient converter 61, reference three-dimensional object model storage unit 55, standard three-dimensional weighting coefficient storage unit 66, reference three-dimensional basic point storage unit 75, standard three-dimensional basic point storage unit 76, and registration unit 4. Registration unit 4 comprises three-dimensional object model register 50 and three-dimensional basic point register 70.

Image input unit 10, comparative image generator 20, pose candidate determining unit 30, weighted matching and pose selecting unit 40, reference three-dimensional object model storage unit 55, and three-dimensional object model register 50 operate in the same manner as the components denoted by the identical reference numerals according to the first embodiment shown in FIG. 4.

Standard three-dimensional weighting coefficient storage unit 66 stores a standard three-dimensional weighting coefficient. Reference three-dimensional basic point storage unit 75 stores reference three-dimensional basic points corresponding to reference three-dimensional object models of objects. Standard three-dimensional basic point storage unit 76 stores standard three-dimensional basic points corresponding to standard three-dimensional object models.

Weighting coefficient converter 61 determines a coordinate correspondence between the standard three-dimensional weighting coefficient obtained from standard three-dimensional weighting coefficient storage unit 66 and the reference three-dimensional object models, using the standard three-dimensional basic points obtained from standard three-dimensional basic point storage unit 76 and the reference three-dimensional basic points obtained from reference three-dimensional basic point storage unit 75, and converts the standard three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on the pose candidates obtained from pose candidate determining unit 30.

Three-dimensional basic point register 70 determines reference three-dimensional basic points with respect to the reference three-dimensional object models obtained from three-dimensional object model register 50, and registers the determined three-dimensional basic points in reference three-dimensional basic point storage unit 75.

Overall operation of the second embodiment for pose estimation will be described in detail below with reference to FIG. 17 and a flowchart shown in FIG. 18. Operation of the second embodiment for one-to-one matching and one-to-N matching is similar to the operation for pose estimation except for the added determining process (step 160 shown in FIG. 6) and the added process of determining a model having a minimum distance value (steps 170 through 175 shown in FIG. 7), as with the first embodiment, and will not be described below.

First, an input image is obtained by image input unit 10 (step 100 in FIG. 18). Then, pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Then, comparative image generator 20 generates comparative images having illuminating conditions close to those of the input image, with respect to the respective pose candidates, based on reference three-dimensional object models C_(k) obtained from reference three-dimensional object model storage unit 55 (step 120).

Weighting coefficient converter 61 determines a coordinate correspondence between the standard three-dimensional weighting coefficient obtained from standard three-dimensional weighting coefficient storage unit 66 and the reference three-dimensional object models, using the standard three-dimensional basic points obtained from standard three-dimensional basic point storage unit 76 and the reference three-dimensional basic points obtained from reference three-dimensional basic point storage unit 75, and converts the standard three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on the pose candidates (step 131).

Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(kj) (or similarity degrees) between the input image and the comparative images, using the two-dimensional weighting coefficients (step 140), and selects the comparative image (pose candidate) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose (step 150).

Overall operation of the present embodiment for registration will be described in detail below. First, three-dimensional object model register 50 registers reference three-dimensional object models of objects C_(k) in reference three-dimensional object model storage unit 55. Then, three-dimensional basic point register 70 determines reference three-dimensional basic points with respect to the reference three-dimensional object models obtained from three-dimensional object model register 50, and registers the determined three-dimensional basic points in reference three-dimensional basic point storage unit 75.

Advantages of the second embodiment will be described below.

According to the present embodiment, comparative images are generated from reference three-dimensional object models depending on pose candidates, and three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to the present invention, furthermore, since only one three-dimensional weighting coefficient is used for all poses, an appropriate three-dimensional weighting coefficient depending on a desired pose can be established in a smaller storage capacity than if two-dimensional weighting coefficients are to be held for respective poses.

According to the present embodiment, furthermore, because a standard three-dimensional weighting coefficient representing an average of three-dimensional weighting coefficients of a plurality of objects is held, the storage capacity for storing the standard three-dimensional weighting coefficient is much smaller than if reference three-dimensional weighting coefficients are to be held for objects. It is not necessary to capture learning images corresponding to reference three-dimensional object models upon registration.

A specific example of operation of the second embodiment will be described below.

As shown in FIG. 10, reference three-dimensional object model storage unit 55 stores reference three-dimensional object models (three-dimensional shapes P_(Q) ^(k)(x,y,z) and textures T_(Q) ^(k)(R,G,B)) of objects C_(k). As shown in FIG. 19, standard three-dimensional weighting coefficient storage unit 66 stores a standard three-dimensional weighting coefficient V_(Q) ⁰.

Furthermore, as shown in FIG. 20, standard three-dimensional basic point storage unit 76 stores the coordinates of standard three-dimensional basic points N_(i) ⁰. As shown in FIG. 21, reference three-dimensional basic point storage unit 75 stores the coordinates of reference three-dimensional basic points N_(i) ^(k). Basic points are points for positional alignment, and refer in FIGS. 20 and 21 to five points: a left eye midpoint, a right eye midpoint, a nose top point, a left mouth corner point, and a right mouth corner point.

The reference three-dimensional basic points may be manually preset. However, the reference three-dimensional basic points may also be set automatically according to a facial feature extracting process disclosed in Marugame and Sakamoto, “Extraction of feature areas from facial three-dimensional data using shape information and color information”, FIT (Forum on Information Technology), 2002, I-100, pages 199-200. The standard three-dimensional basic points may be determined from average coordinates of the reference three-dimensional basic points, or of three-dimensional basic points of three-dimensional object models prepared in advance for learning purposes. The standard three-dimensional weighting coefficient may be determined by positionally aligning reference three-dimensional weighting coefficients, or three-dimensional weighting coefficients of three-dimensional object models prepared in advance for learning purposes, so that the three-dimensional basic points are aligned with the standard three-dimensional basic points, and then averaging the positionally aligned three-dimensional weighting coefficients.

For positionally aligning points other than the basic points, transform equations s₀=Hs(s,t), t₀=Ht(s,t) for the coordinates {s,t} of the three-dimensional weighting coefficients and the coordinates {s₀,t₀} of the standard three-dimensional weighting coefficient can be established by determining a correspondence between basic points according to interpolation or extrapolation. The standard three-dimensional weighting coefficient can also be generated by directly mapping errors of pixels onto the standard three-dimensional model, using learning images.
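
By way of illustration only, the transform equations Hs and Ht can be realized as a single map fitted to the five basic-point correspondences. The following sketch (Python with NumPy; all function names are hypothetical) fits an affine map by least squares, which is one simple stand-in for the interpolation or extrapolation described above; piecewise interpolation over a triangulation of the basic points would follow that description more literally.

    import numpy as np

    def fit_alignment(ref_points, std_points):
        # ref_points, std_points: (N, 2) arrays of corresponding basic-point
        # coordinates {s,t} and {s0,t0}; N = 5 in the facial example.
        n = ref_points.shape[0]
        A = np.hstack([ref_points, np.ones((n, 1))])        # rows [s, t, 1]
        coef, _, _, _ = np.linalg.lstsq(A, std_points, rcond=None)  # (3, 2)

        def warp(st):
            # Apply s0 = Hs(s,t), t0 = Ht(s,t) to an (M, 2) coordinate array.
            st = np.atleast_2d(np.asarray(st, dtype=float))
            return np.hstack([st, np.ones((st.shape[0], 1))]) @ coef

        return warp

An affine map passes exactly through three point pairs; with five basic points it is a least-squares compromise, which is one of several reasonable design choices here.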

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 12 is obtained by image input unit 10 (step 100 shown in FIG. 18). Pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Comparative image generator 20 generates comparative images G_(1j)(r) having illuminating conditions close to those of the input image, with respect to the respective pose candidates e_(j), based on the reference three-dimensional object model C₁ (step 120). An example of comparative images generated with respect to the object C₁ is shown in FIG. 13.

Weighting coefficient converter 61 determines a coordinate correspondence between the standard three-dimensional weighting coefficient V_(Q) ⁰ and the reference three-dimensional object models P_(Q) ^(k), using the standard three-dimensional basic points N_(i) ⁰ and the reference three-dimensional basic points N_(i) ^(k), and converts the standard three-dimensional weighting coefficient V_(Q) ⁰ into a two-dimensional weighting coefficient W_(1j)(r) depending on the pose candidates e_(j) (step 131). When a coordinate correspondence is determined between the standard three-dimensional weighting coefficient V_(Q) ⁰ and the reference three-dimensional object models P_(Q) ^(k) (more precisely, the textures T_(Q) ^(k)), reference three-dimensional weighting coefficients V_(Q) ^(k) can hypothetically be generated (actually, reference three-dimensional weighting coefficients V_(Q) ^(k) are not generated; a two-dimensional weighting coefficient is directly generated from the standard three-dimensional weighting coefficient). The standard three-dimensional basic points N_(i) ⁰ and the reference three-dimensional basic points N_(i) ^(k) are used in order to determine this correspondence between the standard three-dimensional weighting coefficient V_(Q) ⁰ and the textures T_(Q) ^(k) of the reference three-dimensional object models. The coordinates {s₀,t₀} of the standard three-dimensional weighting coefficient which correspond to the coordinates r in the two-dimensional image are determined according to s₀=Hs(F⁻¹(r)), t₀=Ht(F⁻¹(r)). An example of two-dimensional weighting coefficients generated with respect to the object C₁ is shown in FIG. 22.
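
Continuing the sketch above, and assuming the inverse mapping F⁻¹ is available as a per-pixel lookup table produced while rendering the comparative image (an assumption for illustration; the document does not specify the representation), the direct conversion of the standard three-dimensional weighting coefficient into a two-dimensional weighting coefficient might look as follows:

    import numpy as np

    def convert_to_2d_weights(v0, warp, inv_proj_st, valid):
        # v0          : 2-D array holding the standard three-dimensional
        #               weighting coefficient, indexed by (s0, t0).
        # warp        : the basic-point alignment fitted above (Hs and Ht).
        # inv_proj_st : (H, W, 2) array; F^-1(r), i.e. the model texture
        #               coordinates (s, t) rendered to each image pixel r.
        # valid       : (H, W) boolean mask of pixels covered by the model.
        w2d = np.zeros(valid.shape)
        s0t0 = warp(inv_proj_st[valid])        # {s0,t0} = Hs,Ht(F^-1(r))
        s0 = np.clip(np.rint(s0t0[:, 0]).astype(int), 0, v0.shape[0] - 1)
        t0 = np.clip(np.rint(s0t0[:, 1]).astype(int), 0, v0.shape[1] - 1)
        w2d[valid] = v0[s0, t0]                # nearest-neighbour sampling
        return w2d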

Weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) between the input image I(r) and the comparative images G_(1j)(r), using the two-dimensional weighting coefficients W_(1j)(r) (step 140). Finally, weighted matching and pose selecting unit 40 selects a comparative image (pose candidate) whose distance value up to the model C₁ is the smallest, according to D₁=min_(j)D_(1j), thereby estimating an optimum pose (step 150).
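
A minimal sketch of this weighted matching and pose selection step, assuming the input image, comparative images, and two-dimensional weighting coefficients are equally shaped NumPy arrays (helper names are hypothetical):

    import numpy as np

    def weighted_distance(I, G, W):
        # D = sum_r W(r) * (I(r) - G(r))^2, the weighted Euclidean distance.
        return float(np.sum(W * (I - G) ** 2))

    def select_pose(I, comparative_images, weights):
        # Returns the pose index j minimizing D_j, and D = min_j D_j.
        D = [weighted_distance(I, G, W)
             for G, W in zip(comparative_images, weights)]
        j = int(np.argmin(D))
        return j, D[j]

For one-to-N matching, the same selection would simply be repeated per model k and the smallest of the resulting D_(k) extracted.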

3rd Embodiment

Referring to FIG. 23, an object pose estimating and matching system according to a third embodiment of the present invention comprises image input unit 10, comparative image generator 20, pose candidate determining unit 30, weighted matching and pose selecting unit 40, weighting coefficient converter 60, reference three-dimensional object model storage unit 55, variation estimator 35, variation-specific reference three-dimensional weighting coefficient storage unit 67, and a registration unit 3. Registration unit 3 comprises three-dimensional object model register 50, matching and pose selecting unit 41, and variation-specific three-dimensional weighting coefficient generator 63.

Image input unit 10, comparative image generator 20, pose candidate determining unit 30, weighted matching and pose selecting unit 40, weighting coefficient converter 60, reference three-dimensional object model storage unit 55, three-dimensional object model register 50, and matching and pose selecting unit 41 operate in the same manner as the components denoted by the identical reference numerals according to the first embodiment shown in FIG. 4.

Variation-specific reference three-dimensional weighting coefficient storage unit 67 stores reference three-dimensional weighting coefficients corresponding to reference three-dimensional object models and image variations. Variation estimator 35 determines a correspondence between the input image obtained from image input unit 10 and an area of a three-dimensional object model, using the pose candidates obtained from pose candidate determining unit 30 and the reference three-dimensional object models obtained from reference three-dimensional object model storage unit 55, and estimates a variation based on image information of a given area. Furthermore, variation estimator 35 sends a reference weighting coefficient based on the estimated variation, among the variation-specific reference weighting coefficients stored in variation-specific reference three-dimensional weighting coefficient storage unit 67, to weighting coefficient converter 60.

Variation-specific three-dimensional weighting coefficient generator 63 generates variation-specific reference three-dimensional weighting coefficients by learning the degree of importance in matching of each pixel on the three-dimensional model, for each image variation obtained from variation estimator 35, based on a pixel correspondence between the reference three-dimensional object models obtained from reference three-dimensional object model storage unit 55, the two-dimensional image determined by the optimum pose, and the three-dimensional model, using the comparative image of the optimum pose obtained from matching and pose selecting unit 41 and the input image, and registers the generated reference three-dimensional weighting coefficients in variation-specific reference three-dimensional weighting coefficient storage unit 67.

Overall operation of the third embodiment for pose estimation will be described in detail below with reference to FIG. 23 and a flowchart shown in FIG. 24.

First, an input image of a model (object) is obtained by image input unit 10 (step 100 in FIG. 24). Then, pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Then, comparative image generator 20 generates comparative images having illuminating conditions close to those of the input image, with respect to the respective pose candidates, based on reference three-dimensional object models C_(k) obtained from reference three-dimensional object model storage unit 55 (step 120).

Variation estimator 35 determines a correspondence between an area of a three-dimensional object model and the input image, using the pose candidates e_(j) and the reference three-dimensional object models C_(k), and estimates a variation b based on image information of a given area. Furthermore, variation estimator 35 sends a reference weighting coefficient based on the estimated variation b, among the variation-specific reference weighting coefficients stored in variation-specific reference three-dimensional weighting coefficient storage unit 67, to weighting coefficient converter 60 (step 180).

Weighting coefficient converter 60 converts reference three-dimensional weighting coefficients of the variation b obtained from variation-specific reference three-dimensional weighting coefficient storage unit 67 into two-dimensional weighting coefficients depending on the pose candidates, using the reference three-dimensional object models (step 130). Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(kj) (or similarity degrees) between the input image and the comparative images, using the two-dimensional weighting coefficients (step 140), and selects a comparative image (pose candidate) whose distance value up to the model (object) for the input image is the smallest, thereby estimating an optimum pose (step 150).

Overall operation of the present embodiment for registration will be described in detail below with reference to FIG. 23 and a flowchart shown in FIG. 25.

First, three-dimensional object model register 50 registers reference three-dimensional object models of objects C_(k) in reference three-dimensional object model storage unit 55 (step 300 in FIG. 25).

Then, variation-specific three-dimensional weighting coefficient generator 63 first sets an image number h=1 (step 210) and then enters a learning image having the image number h from image input unit 10 (step 200), for learning reference three-dimensional weighting coefficients using the learning image and the reference three-dimensional object models.

Then, pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Then, comparative image generator 20 generates comparative images having illuminating conditions close to those of the input image, with respect to the respective pose candidates, based on the reference three-dimensional object models C_(k) obtained from reference three-dimensional object model storage unit 55 (step 120).

Matching and pose selecting unit 41 calculates distance values D_(kj)′ (or similarity degrees) between the input image and the comparative images (step 141), and selects one of the comparative images (pose candidates) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose (step 150).

Then, variation estimator 35 determines a correspondence between an area of a three-dimensional object model and the input image, using the pose candidates e_(j) and the reference three-dimensional object models C_(k), and estimates a variation b based on image information of a given area (step 180).

Then, variation-specific three-dimensional weighting coefficient generator 63 increments the image number h by 1 (step 211). If the image number h is equal to or smaller than the number N of learning images (step 212), then control goes back to step 200 for determining a comparative image having an optimum pose which corresponds to a next learning image.

If the image number h is greater than the number N of learning images, then variation-specific three-dimensional weighting coefficient generator 63 generates variation-specific reference three-dimensional weighting coefficients by learning the degree of importance in matching of each pixel on the three-dimensional model, for each image variation b obtained from variation estimator 35, based on a pixel correspondence between the reference three-dimensional object models, the two-dimensional image determined by the optimum pose, and the three-dimensional model, using the comparative images of the optimum poses which correspond to all the learning images (step 221).
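
As an illustrative sketch only (hypothetical names), this learning step can be pictured as grouping the per-image error maps, already projected onto the texture space of the three-dimensional model through the pixel correspondence described above, by the estimated variation b, and setting each weighting coefficient inversely proportional to the average error, in the spirit of the W=A/E rule stated for the fifth embodiment below:

    import numpy as np

    def learn_variation_specific_weights(error_maps, variations, A=1.0, eps=1e-6):
        # error_maps : list of 2-D arrays; per-learning-image matching errors
        #              already mapped onto the model texture space.
        # variations : estimated variation label b for each learning image.
        weights = {}
        for b in set(variations):
            group = [e for e, v in zip(error_maps, variations) if v == b]
            E = np.mean(np.stack(group), axis=0)   # average error, variation b
            weights[b] = A / (E + eps)             # importance ~ inverse error
        return weights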

Finally, variation-specific three-dimensional weighting coefficient generator 63 registers the generated reference three-dimensional weighting coefficients in variation-specific reference three-dimensional weighting coefficient storage unit 67 (step 231).

Advantages of the third embodiment will be described below.

According to the present embodiment, comparative images are generated from reference three-dimensional object models depending on pose candidates, and three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to the present invention, furthermore, since only one three-dimensional weighting coefficient is used for all poses, an appropriate three-dimensional weighting coefficient depending on a desired pose can be established in a smaller storage capacity than if two-dimensional weighting coefficients are to be held for respective poses.

According to the present embodiment, furthermore, because the degree of importance in matching of each pixel is learned on a three-dimensional model, an appropriate three-dimensional weighting coefficient with respect to an arbitrary pose can be determined with fewer learning images than would be required to prepare learning images for all poses.

According to the present embodiment, moreover, variation-specific three-dimensional weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the input image, and a corresponding three-dimensional weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur, such as object deformations and illuminating condition variations.

A specific example of operation of the third embodiment will be described below.

Image variations may be represented by object deformations and illuminating condition variations, for example. Illuminating condition variations may be three variations: a right illuminating direction (b=1), a front illuminating direction (b=2), and a left illuminating direction (b=3), as shown in FIG. 26.

As shown in FIG. 10, reference three-dimensional object model storage unit 55 stores reference three-dimensional object models (three-dimensional shapes P_(Q) ^(k)(x,y,z) and textures T_(Q) ^(k)(R,G,B)) of objects C_(k). As shown in FIG. 27, variation-specific reference three-dimensional weighting coefficient storage unit 67 stores variation-specific reference three-dimensional weighting coefficients V_(Q) ^(kb). The variation-specific reference three-dimensional weighting coefficients can be generated by grouping learning images with respect to each variation, manually or automatically using variation estimator 35, and learning reference three-dimensional weighting coefficients with respect to each of the groups.

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 28 is obtained by image input unit 10 (step 100 shown in FIG. 24). Pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Comparative image generator 20 generates comparative images G_(1j)(r) having illuminating conditions close to those of the input image, with respect to the respective pose candidates e_(j), based on the reference three-dimensional object model C₁ (step 120). An example of comparative images generated with respect to the model C₁ is shown in FIG. 29.

Variation estimator 35 determines a correspondence between an area of a three-dimensional object model and the input image, using the pose candidates e_(j) and the reference three-dimensional object models C_(k), estimates a variation b based on image information of a given area, and sends a reference weighting coefficient based on the estimated variation b, among the variation-specific reference weighting coefficients stored in variation-specific reference three-dimensional weighting coefficient storage unit 67, to weighting coefficient converter 60 (step 180).

Illumination variations are estimated as follows: if the average luminance values of the right and left halves of the face are represented by L₁ and L₂, respectively, then the front illuminating direction (b=2) is estimated when |L₁−L₂|≦Th (Th represents a threshold value), the right illuminating direction (b=1) is estimated when L₁>L₂+Th, and the left illuminating direction (b=3) is estimated when L₂>L₁+Th.
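
This rule transcribes directly into code (a sketch; the threshold value Th is application-dependent):

    def estimate_illumination(L1, L2, Th):
        # L1, L2: average luminance of the right and left face halves.
        if abs(L1 - L2) <= Th:
            return 2                        # front illuminating direction
        return 1 if L1 > L2 + Th else 3     # right (b=1) or left (b=3)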

If the pose candidate for the input image shown in FIG. 28 is assumed to be e₁, then, because the pose does not match the input image, the shadow on the left half of the face does not register as a large area; the difference between the average luminance values satisfies |L₁−L₂|≦Th, and the front illuminating direction (b=2) is determined. Similarly, if the pose candidate for the input image is assumed to be e₂, then the front illuminating direction (b=2) is also determined. If the pose candidate for the input image is assumed to be e₃, then the right illuminating direction (b=1) is correctly determined. The variation can be estimated more accurately by using a light source direction estimating means such as that in the image matching method disclosed in JP-2002-24830A, for example. Weighting coefficient converter 60 converts the reference three-dimensional weighting coefficients V_(Q) ^(1b) of the variation b into two-dimensional weighting coefficients W_(1jb)(r) depending on the pose candidates e_(j) (step 132). An example of two-dimensional weighting coefficients generated based on the results of the above pose estimation is shown in FIG. 30.

Weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) between the input image I(r) and the comparative images G_(1j)(r), using the two-dimensional weighting coefficients W_(1jb)(r) (step 140). Finally, weighted matching and pose selecting unit 40 selects a comparative image (pose candidate) whose distance value up to the model C₁ is the smallest, according to D₁=min_(j)D_(1j), thereby estimating an optimum pose (step 150).

4th Embodiment

Referring to FIG. 31, an object pose estimating and matching system according to a fourth embodiment of the present invention comprises image input unit 10, comparative image generator 20, pose candidate determining unit 30, weighted matching and pose selecting unit 40, weighting coefficient converter 61, reference three-dimensional object model storage unit 55, reference three-dimensional basic point storage unit 75, standard three-dimensional basic point storage unit 76, variation estimator 35, variation-specific standard three-dimensional weighting coefficient storage unit 68, and a registration unit 4. Registration unit 4 comprises three-dimensional object model register 50 and three-dimensional basic point register 70.

Image input unit 10, comparative image generator 20, pose candidate determining unit 30, weighted matching and pose selecting unit 40, weighting coefficient converter 61, reference three-dimensional object model storage unit 55, reference three-dimensional basic point storage unit 75, standard three-dimensional basic point storage unit 76, three-dimensional object model register 50, and three-dimensional basic point register 70 operate in the same manner as the components denoted by the identical reference numerals according to the second embodiment shown in FIG. 17. Variation estimator 35 operates in the same manner as variation estimator 35 according to the third embodiment shown in FIG. 23. Variation-specific standard three-dimensional weighting coefficient storage unit 68 stores standard three-dimensional weighting coefficients corresponding to image variations.

Overall operation of the present embodiment for pose estimation will be described in detail below with reference to FIG. 31 and a flowchart shown in FIG. 32.

First, an input image of a model (object) is obtained by image input unit 10 (step 100 in FIG. 32). Then, pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Then, comparative image generator 20 generates comparative images having illuminating conditions close to those of the input image, with respect to the respective pose candidates, based on reference three-dimensional object models C_(k) obtained from reference three-dimensional object model storage unit 55 (step 120).

Variation estimator 35 determines a correspondence between an area of a three-dimensional object model and the input image, using the pose candidates e_(j) and the reference three-dimensional object models C_(k), estimates a variation b based on image information of a given area, and sends a standard weighting coefficient based on the estimated variation b, among the variation-specific standard weighting coefficients stored in variation-specific standard three-dimensional weighting coefficient storage unit 68, to weighting coefficient converter 61 (step 180).

Weighting coefficient converter 61 determines a coordinate correspondence between the standard three-dimensional weighting coefficient of the variation b obtained from variation-specific standard three-dimensional weighting coefficient storage unit 68 and the reference three-dimensional object models, depending on the pose candidates, using the standard three-dimensional basic points obtained from standard three-dimensional basic point storage unit 76 and the reference three-dimensional basic points obtained from reference three-dimensional basic point storage unit 75, and converts the standard three-dimensional weighting coefficients into two-dimensional weighting coefficients (step 133).

Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(kj) (or similarity degrees) between the input image and the comparative images, using the two-dimensional weighting coefficients (step 140), and selects a comparative image (pose candidate) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose (step 150).

Advantages of the fourth embodiment will be described below.

According to the present embodiment, comparative images are generated from reference three-dimensional object models depending on pose candidates, and three-dimensional weighting coefficients are converted into two-dimensional weighting coefficients, so that weighted distances are calculated. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to the present invention, furthermore, since only one three-dimensional weighting coefficient is used for all poses, an appropriate three-dimensional weighting coefficient depending on an arbitrary pose can be established in a smaller storage capacity than if two-dimensional weighting coefficients are to be held for respective poses.

According to the present embodiment, furthermore, because a standard three-dimensional weighting coefficient representing an average of three-dimensional weighting coefficients of a plurality of objects is held, the storage capacity for storing the standard three-dimensional weighting coefficient is much smaller than if reference three-dimensional weighting coefficients are to be held for objects. It is not necessary to capture learning images corresponding to reference three-dimensional object models upon registration.

According to the present embodiment, moreover, variation-specific three-dimensional weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the input image, and a corresponding three-dimensional weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur, such as object deformations and illuminating condition variations.

A specific example of operation of the fourth embodiment will be described below.

As shown in FIG. 10, reference three-dimensional object model storage unit 55 stores reference three-dimensional object models (three-dimensional shapes P_(Q) ^(k)(x,y,z) and textures T_(Q) ^(k)(R,G,B)) of objects C_(k). As shown in FIG. 20, standard three-dimensional basic point storage unit 76 stores the coordinates of standard three-dimensional basic points N_(i) ⁰. As shown in FIG. 21, reference three-dimensional basic point storage unit 75 stores the coordinates of reference three-dimensional basic points N_(i) ^(k). Furthermore, as shown in FIG. 33, variation-specific standard three-dimensional weighting coefficient storage unit 68 stores standard three-dimensional weighting coefficients V_(Q) ^(0b) classified according to variations. The variation-specific standard three-dimensional weighting coefficients can be generated by grouping learning images with respect to each variation, manually or automatically using variation estimator 35, and learning standard three-dimensional weighting coefficients with respect to each of the groups.

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 28 is obtained by image input unit 10 (step 100 shown in FIG. 32). Pose candidate determining unit 30 determines a pose candidate group {e_(j)} (step 110). Comparative image generator 20 generates comparative images G_(1j)(r) having illuminating conditions close to those of the input image, with respect to the respective pose candidates e_(j), based on the reference three-dimensional object model C₁ (step 120). An example of comparative images generated with respect to the model C₁ is shown in FIG. 29.

Variation estimator 35 determines a correspondence between an area of a three-dimensional object model and the input image, using the pose candidates e_(j) and the reference three-dimensional object models C_(k), estimates a variation b based on image information of a given area, and sends a standard weighting coefficient of the corresponding variation b, among the variation-specific standard weighting coefficients stored in variation-specific standard three-dimensional weighting coefficient storage unit 68, to weighting coefficient converter 61 (step 180).

Weighting coefficient converter 61 determines a coordinate correspondence between the standard three-dimensional weighting coefficients V_(Q) ^(0b) of the variation b and the reference three-dimensional object models P_(Q) ^(k) depending on the pose candidates e_(j), using the standard three-dimensional basic points N_(i) ⁰ and the reference three-dimensional basic points N_(i) ^(k), and converts the standard three-dimensional weighting coefficients V_(Q) ^(0b) into two-dimensional weighting coefficients W_(1jb)(r) (step 133). An example of two-dimensional weighting coefficients generated based on the results of the above pose estimation is shown in FIG. 34.

Weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) between the input image I(r) and the comparative images G_(1j)(r), using the two-dimensional weighting coefficients W_(1jb)(r) (step 140). Finally, weighted matching and pose selecting unit 40 selects a comparative image (pose candidate) whose distance value up to the model C₁ is the smallest, according to D₁=min_(j)D_(1j), thereby estimating an optimum pose (step 150).

5th Embodiment

Referring to FIG. 35, an object pose estimating and matching system according to a fifth embodiment of the present invention comprises image input unit 10, normalizer 15, weighted matching and pose selecting unit 40, pose-specific reference image storage unit 85, pose-specific reference weighting coefficient storage unit 95, and registration unit 7. Registration unit 7 comprises pose-specific reference image register 80, matching and pose selecting unit 41, and pose-specific weighting coefficient generator 90.

Image input unit 10, normalizer 15, matching and pose selecting unit 41, and pose-specific reference image storage unit 85 operate in the same manner as the components denoted by the identical reference numerals according to the first prior art. Weighted matching and pose selecting unit 40 calculates weighted distance values between the normalized image and pose-specific reference images obtained from pose-specific reference image storage unit 85, using the pose-specific weighting coefficients obtained from pose-specific reference weighting coefficient storage unit 95, and selects a reference image whose distance value is the smallest, thereby estimating an optimum pose. Pose-specific reference image register 80 registers pose-specific reference images in pose-specific reference image storage unit 85. Pose-specific weighting coefficient generator 90 generates pose-specific reference weighting coefficients by learning the degree of importance in matching of each pixel, with respect to respective poses, using a reference image of the optimum pose obtained from matching and pose selecting unit 41 and the input image, and registers the generated pose-specific reference weighting coefficients in pose-specific reference weighting coefficient storage unit 95.

Overall operation of the fifth embodiment for pose estimation will be described in detail below with reference to FIG. 35 and a flowchart shown in FIG. 36.

First, an input image is obtained by image input unit 10 (step 100 in FIG. 36). Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image (step 101).

Finally, weighted matching and pose selecting unit 40 calculates weighted distance values (or similarity degrees) between the normalized image and the pose-specific reference images obtained from pose-specific reference image storage unit 85, using the pose-specific reference weighting coefficients obtained from pose-specific reference weighting coefficient storage unit 95 (step 145), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155).

Overall operation of the fifth embodiment for registration will be described in detail below with reference to FIG. 35 and a flowchart shown in FIG. 37.

First, pose-specific reference image register 80 registers reference images of objects C_(k) in pose-specific reference image storage unit 85 (step 310 in FIG. 37). Then, pose-specific weighting coefficient generator 90 first sets an image number h=1 (step 210) and then enters a learning image having the image number h from image input unit 10 (step 200), in order to learn reference weighting coefficients using the learning image and the reference images.

Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image (step 101). Matching and pose selecting unit 41 calculates distance values D_(kj)′ (or similarity degrees) between the normalized image and the reference images (step 145), and selects one of the reference images (pose candidates) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose (step 155).

Then, pose-specific weighting coefficient generator 90 increments the image number h by 1 (step 211). If the image number h is equal to or smaller than the number N of learning images (step 212), then control goes back to step 200 for determining a reference image having an optimum pose which corresponds to a next learning image. If the image number h is greater than the number N of learning images, then pose-specific weighting coefficient generator 90 generates pose-specific reference weighting coefficients by learning the degree of importance in matching of each pixel with respect to each of the poses, using the reference images of the optimum poses which correspond to all the learning images (step 225). The fifth embodiment differs from the first embodiment only in that it has pose-specific reference images and reference weighting coefficients instead of one reference three-dimensional object model and reference three-dimensional weighting coefficients. That is, the comparative images for the respective pose candidates and the two-dimensional weighting coefficients in the first embodiment correspond to the reference images and the reference weighting coefficients in the fifth embodiment. Therefore, with respect to the learning of the degree of importance, if the errors d_(kj) ^(h)(r) in the first embodiment are calculated using the reference images in place of the comparative images G, and a (two-dimensional) average error thereof is represented by E^(kj), then the reference weighting coefficients can be calculated according to W_(kj)=A/E^(kj).
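
A compact sketch of this W_(kj)=A/E^(kj) rule, assuming the errors d_(kj) ^(h)(r) have already been collected per object k and pose j (hypothetical names; eps merely guards against division by zero):

    import numpy as np

    def learn_pose_specific_weights(errors_by_pose, A=1.0, eps=1e-6):
        # errors_by_pose : {(k, j): [d_kj^h(r), ...]} collecting, per object k
        # and pose j, the error images of all learning images whose optimum
        # pose was j.  Returns W_kj = A / E_kj for each (k, j).
        return {kj: A / (np.mean(np.stack(ds), axis=0) + eps)
                for kj, ds in errors_by_pose.items()}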

Finally, the pose-specific reference weighting coefficients are registered in pose-specific reference weighting coefficient storage unit 95 (step 235).

Advantages of the fifth embodiment will be described below.

According to the present embodiment, weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

A specific example of operation of the fifth embodiment will be described below.

As shown in FIG. 38, pose-specific reference image storage unit 85 stores pose-specific reference images R_(kj) of objects C_(k). As shown in FIG. 39, pose-specific reference weighting coefficient storage unit 95 stores pose-specific reference weighting coefficients W_(kj).

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 12 is obtained by image input unit 10 (step 100 shown in FIG. 36). Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image I′(r) (step 101). An example of a normalized image with respect to the input image shown in FIG. 12 is illustrated in FIG. 40. Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) (or similarity degrees) between the normalized image I′(r) and the pose-specific reference images R_(1j)(r) obtained from pose-specific reference image storage unit 85, using the pose-specific reference weighting coefficients W_(1j) of the respective objects obtained from pose-specific reference weighting coefficient storage unit 95 (step 145), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155). If the Euclidean distance is used, then the weighted distance is calculated according to D_(kj)=Σ_(r)W_(kj)(r){I′(r)−R_(kj)(r)}². For the normalized image shown in FIG. 40, R₁₃ of the pose e₃, for example, is selected as the reference image whose distance value is the smallest.

6th Embodiment

Referring to FIG. 41, an object pose estimating and matching system according to a sixth embodiment of the present invention comprises image input unit 10, normalizer 15, weighted matching and pose selecting unit 40, pose-specific reference image storage unit 85, pose-specific standard weighting coefficient storage unit 96, and registration unit 9. Registration unit 9 comprises pose-specific reference image register 80.

Image input unit 10, normalizer 15, and pose-specific reference image storage unit 85 operate in the same manner as the components denoted by the identical reference numerals according to the fifth embodiment shown in FIG. 35. Pose-specific standard weighting coefficient storage unit 96 stores pose-specific standard weighting coefficients. Weighted matching and pose selecting unit 40 calculates weighted distance values between the normalized image and pose-specific reference images obtained from pose-specific reference image storage unit 85, using the pose-specific standard weighting coefficients obtained from pose-specific standard weighting coefficient storage unit 96, and selects a reference image whose distance value is the smallest, thereby estimating an optimum pose.

Overall operation of the sixth embodiment for pose estimation will be described in detail below with reference to FIG. 41 and a flowchart shown in FIG. 42.

First, an input image is obtained by image input unit 10 (step 100 in FIG. 42). Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image (step 101). Finally, weighted matching and pose selecting unit 40 calculates weighted distance values (or similarity degrees) between the normalized image and the pose-specific reference images of the respective objects obtained from pose-specific reference image storage unit 85, using the pose-specific standard weighting coefficients obtained from pose-specific standard weighting coefficient storage unit 96 (step 146), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155).

Advantages of the sixth embodiment will be described below.

According to the present embodiment, weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to the present embodiment, furthermore, because a standard weighting coefficient representing an average of weighting coefficients of a plurality of objects is held, the storage capacity for storing the standard weighting coefficients is much smaller than if reference weighting coefficients are to be held for objects. It is not necessary to capture learning images corresponding to reference images upon registration.

A specific example of operation of the sixth embodiment will be described below.

As shown in FIG. 38, pose-specific reference image storage unit 85 stores pose-specific reference images R_(kj) of objects C_(k). As shown in FIG. 43, pose-specific standard weighting coefficient storage unit 96 stores pose-specific standard weighting coefficients W_(0j). The pose-specific standard weighting coefficients can be determined by averaging the pose-specific reference weighting coefficients for each pose, or by learning them from reference images prepared in advance for each pose.
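
For illustration, the averaging option might be sketched as follows (hypothetical names; it is assumed that every object has reference weighting coefficients for the same set of poses):

    import numpy as np

    def pose_specific_standard_weights(reference_weights):
        # reference_weights : {k: {j: W_kj}} per object k and pose j.
        # W_0j is the average of W_kj over all objects k, for each pose j.
        poses = next(iter(reference_weights.values())).keys()
        return {j: np.mean(np.stack([wk[j] for wk in reference_weights.values()]),
                           axis=0)
                for j in poses}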

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 12 is obtained by image input unit 10 (step 100 shown in FIG. 42). Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image I′(r) (step 101). An example of a normalized image with respect to the input image shown in FIG. 12 is illustrated in FIG. 40. Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) (or similarity degrees) between the normalized image I′(r) and the pose-specific reference images R_(1j)(r) of the respective objects obtained from pose-specific reference image storage unit 85, using the pose-specific standard weighting coefficients W_(0j) obtained from pose-specific standard weighting coefficient storage unit 96 (step 146), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155).

7th Embodiment

Referring to FIG. 44, an object pose estimating and matching system according to a seventh embodiment of the present invention comprises image input unit 10, normalizer 15, weighted matching and pose selecting unit 40, pose-specific reference image storage unit 85, pose- and variation-specific reference weighting coefficient storage unit 97, variation estimator 36, standard three-dimensional object model storage unit 56, and registration unit 8. Registration unit 8 comprises pose-specific reference image register 80, matching and pose selecting unit 41, and pose- and variation-specific weighting coefficient generator 91.

Image input unit 10, normalizer 15, weighted matching and pose selecting unit 40, pose-specific reference image storage unit 85, pose-specific reference image register 80, and matching and pose selecting unit 41 operate in the same manner as the components denoted by the identical reference numerals according to the fifth embodiment.

Pose- and variation-specific reference weighting coefficient storage unit 97 stores pose- and variation-specific weighting coefficients. Standard three-dimensional object model storage unit 56 stores standard three-dimensional object models.

Variation estimator 36 determines a correspondence between the normalized image obtained from normalizer 15 and an area of a three-dimensional object model, using the pose information of the reference images obtained from pose-specific reference image storage unit 85 and the standard three-dimensional object models obtained from standard three-dimensional object model storage unit 56, and estimates a variation based on image information of a given area. Furthermore, variation estimator 36 sends a corresponding pose- and variation-specific weighting coefficient based on the estimated variation, among the pose- and variation-specific weighting coefficients stored in pose- and variation-specific reference weighting coefficient storage unit 97, to weighted matching and pose selecting unit 40.

Pose- and variation-specific weighting coefficient generator 91 generates pose- and variation-specific reference weighting coefficients by learning the degree of importance in matching of each pixel for each image variation obtained from variation estimator 36, using the reference image of the optimum pose obtained from matching and pose selecting unit 41 and the input image, and registers the generated pose- and variation-specific reference weighting coefficients in pose- and variation-specific reference weighting coefficient storage unit 97.

Overall operation of the seventh embodiment for pose estimation will be described in detail below with reference to FIG. 44 and a flowchart shown in FIG. 45.

First, an input image is obtained by image input unit 10 (step 100 in FIG. 45). Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image (step 101).

Variation estimator 36 determines a correspondence between the normalized image and an area of a three-dimensional object model, using the pose information of the reference images obtained from pose-specific reference image storage unit 85 and the standard three-dimensional object models obtained from standard three-dimensional object model storage unit 56, estimates a variation b based on image information of a given area, and sends a corresponding pose- and variation-specific weighting coefficient based on the estimated variation, among the pose- and variation-specific weighting coefficients stored in pose- and variation-specific reference weighting coefficient storage unit 97, to weighted matching and pose selecting unit 40 (step 181).

Finally, weighted matching and pose selecting unit 40 calculates weighted distance values (or similarity degrees) between the normalized image and the pose-specific reference images obtained from pose-specific reference image storage unit 85, using the pose- and variation-specific reference weighting coefficients obtained from pose- and variation-specific reference weighting coefficient storage unit 97 (step 147), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155).

Overall operation of the seventh embodiment for registration will be described in detail below with reference to FIG. 44 and a flowchart shown in FIG. 46.

First, pose-specific reference image register 80 registers reference images of objects C_(k) in pose-specific reference image storage unit 85 (step 310 in FIG. 46).

Then, pose- and variation-specific weighting coefficient generator 91 first sets an image number h=1 (step 210) and then enters a learning image having the image number h from image input unit 10 (step 200), in order to learn reference weighting coefficients using the learning image and the reference images.

Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image (step 101). Matching and pose selecting unit 41 calculates distance values D_(kj)′ (or similarity degrees) between the normalized image and the reference images (step 145), and selects one of the reference images (pose candidates) whose distance value up to the model (object) is the smallest, thereby estimating an optimum pose (step 155).

Then, variation estimator 36 determines a correspondence between the normalized image and an area of a three-dimensional object model, using the pose information of the reference images obtained from pose-specific reference image storage unit 85 and the standard three-dimensional object models obtained from standard three-dimensional object model storage unit 56, and estimates a variation b based on image information of a given area (step 181).

Then, pose- and variation-specific weighting coefficient generator 91 increments the image number h by 1 (step 211). If the image number h is equal to or smaller than the number N of learning images (step 212), then control goes back to step 200 for determining a reference image having an optimum pose which corresponds to a next learning image.

If the image number h is greater than the number N of learning images, then pose- and variation-specific weighting coefficient generator 91 generates pose- and variation-specific reference weighting coefficients by learning the degree of importance in matching of each pixel with respect to each pose and variation b, using the reference images of the optimum poses which correspond to all the learning images (step 226).

Finally, pose- and variation-specific weighting coefficient generator 91 registers the generated pose- and variation-specific reference weighting coefficients in pose- and variation-specific reference weighting coefficient storage unit 97 (step 236).

Advantages of the seventh embodiment will be described below.

According to the present embodiment, weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to the present invention, moreover, pose- and variation-specific weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the normalized image, and a corresponding weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur, such as object deformations and illuminating condition variations.

A specific example of operation of the seventh embodiment will be described below.

As shown in FIG. 38, pose-specific reference image storage unit 85 stores pose-specific reference images R_(kj) of objects C_(k). As shown in FIG. 47, pose- and variation-specific reference weighting coefficient storage unit 97 stores pose- and variation-specific reference weighting coefficients W_(kjb).

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 28 is obtained by image input unit 10 (step 100 shown in FIG. 45).

Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image I′(r) (step 101).

Variation estimator 36 determines a correspondence between the normalized image and an area of a three-dimensional object model, using the pose information e_(j) of the reference images R_(1j) obtained from pose-specific reference image storage unit 85 and the standard three-dimensional object models obtained from standard three-dimensional object model storage unit 56, estimates a variation b based on image information of a given area, and sends a corresponding pose- and variation-specific weighting coefficient based on the estimated variation, among the pose- and variation-specific weighting coefficients W_(kjb) stored in pose- and variation-specific reference weighting coefficient storage unit 97, to weighted matching and pose selecting unit 40 (step 181). Which area of the standard three-dimensional object models each pixel of the reference images corresponds to is determined from the pose information of the reference images and the standard three-dimensional object models. Since the normalized image is matched, with an assumed pose, against the reference images, it is then determined which area of the standard three-dimensional object models each pixel of the normalized image corresponds to, using the correspondence of pixels between the normalized image and the reference images. The process of estimating a variation based on the image information of the given area is the same as the process in the third embodiment: a variation is estimated, for example, from the average luminance values of the right and left halves of the face. Because it is known which area of the standard three-dimensional object models each pixel of the normalized image corresponds to, the average luminance values of the right and left halves of the face can be calculated using the luminance values of the pixels of the normalized image, and a variation can be estimated.
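
As an illustrative sketch, under the added assumption that this area correspondence has been reduced to per-pose boolean masks for the right and left face halves (rendered once from the standard three-dimensional object model under pose e_(j); all names hypothetical), the luminance computation is then immediate:

    def half_face_luminances(norm_image, right_mask, left_mask):
        # right_mask / left_mask : boolean masks marking the pixels of the
        # normalized image that correspond, under the assumed pose, to the
        # right and left halves of the standard three-dimensional model.
        L1 = float(norm_image[right_mask].mean())
        L2 = float(norm_image[left_mask].mean())
        return L1, L2

The returned L₁ and L₂ would then feed the same thresholding rule as in the third embodiment.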

Illumination variations are estimated as follows: if the front illuminating direction (b=2), the front illuminating direction (b=2), and the right illuminating direction (b=1) are determined with respect to the pose information e₁, e₂, e₃, respectively, then the pose- and variation-specific weighting coefficients W₁₁₂, W₁₂₂, W₁₃₁ are selected. Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) between the normalized image I′(r) and the pose-specific reference images R_(1j), using the pose- and variation-specific reference weighting coefficients W_(1jb)(r) obtained from pose- and variation-specific reference weighting coefficient storage unit 97 (step 147), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155). For the normalized image shown in FIG. 40, the distance value is the smallest when the pose is represented by e₃ and the pose- and variation-specific weighting coefficient is represented by W₁₃₁ for the variation (illuminating condition) b=1, and the reference image R₁₃ whose distance value is the smallest is selected.

8th Embodiment

Referring to FIG. 48, an object pose estimating and matching system according to an eighth embodiment of the present invention comprises image input unit 10, normalizer 15, weighted matching and pose selecting unit 40, pose-specific reference image storage unit 85, pose- and variation-specific standard weighting coefficient storage unit 98, variation estimator 36, standard three-dimensional object model storage unit 56, and registration unit 9. Registration unit 9 comprises pose-specific reference image register 80.

Image input unit 10, normalizer 15, weighted matching and pose selecting unit 40, pose-specific reference image storage unit 85, variation estimator 36, standard three-dimensional object model storage unit 56, and pose-specific reference image register 80 operate in the same manner as the components denoted by the identical reference numerals according to the seventh embodiment shown in FIG. 44. Weighted matching and pose selecting unit 40 calculates weighted distance values between the normalized image obtained from normalizer 15 and pose-specific reference images obtained from pose-specific reference image storage unit 85, using the pose- and variation-specific standard weighting coefficients obtained from pose- and variation-specific standard weighting coefficient storage unit 98, and selects a reference image whose distance value is the smallest, thereby estimating an optimum pose.

Overall operation of the eighth embodiment for pose estimation will be described in detail below with reference to FIG. 48 and a flowchart shown in FIG. 49.

First, an input image is obtained by image input unit 10 (step 100 in FIG. 49). Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image (step 101).

Variation estimator 36 determines a correspondence between the normalized image and an area of a three-dimensional object model, using the pose information of the reference images obtained from pose-specific reference image storage unit 85 and the standard three-dimensional object models obtained from standard three-dimensional object model storage unit 56, estimates a variation b based on image information of a given area, and sends a corresponding pose- and variation-specific standard weighting coefficient based on the estimated variation, among the pose- and variation-specific standard weighting coefficients stored in pose- and variation-specific standard weighting coefficient storage unit 98, to weighted matching and pose selecting unit 40 (step 181).

Finally, weighted matching and pose selecting unit 40 calculates weighted distance values (or similarity degrees) between the normalized image and the pose-specific reference images obtained from pose-specific reference image storage unit 85, using the pose- and variation-specific standard weighting coefficients obtained from pose- and variation-specific standard weighting coefficient storage unit 98 (step 147), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155).

Advantages of the eighth embodiment will be described below.

According to the eighth embodiment, weighted distances are calculated using pose-specific weighting coefficients corresponding to pose-specific reference images. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on poses.

According to the present embodiment, furthermore, because a standard weighting coefficient representing an average of weighting coefficients of a plurality of objects is held, the storage capacity for storing the standard weighting coefficients is much smaller than if reference weighting coefficients are to be held for objects. It is not necessary to capture learning images corresponding to reference images upon registration.

According to the present embodiment, moreover, pose- and variation-specific weighting coefficients corresponding to variations which can occur in the input image are held, a variation is estimated from the normalized image, and the corresponding weighting coefficient is employed. Therefore, highly accurate pose estimation and matching can be performed by setting appropriate weighting coefficients depending on variations that may occur, such as object deformations and variations in illuminating conditions.

A specific example of operation of the eighth embodiment will be described below.

As shown in FIG. 38, pose-specific reference image storage unit 85 stores pose-specific reference images R_(kj) of objects C_(k). As shown in FIG. 43, pose- and variation-specific standard weighting coefficient storage unit 98 stores pose- and variation-specific standard weighting coefficients W_(0jb). The pose- and variation-specific standard weighting coefficients can be determined by averaging the pose- and variation-specific reference weighting coefficients for each pose and variation, or by learning from reference images prepared for each pose and variation.
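For instance, if pose- and variation-specific reference weighting coefficients W_(kjb) have been learned for a number of registered objects, the standard coefficients W_(0jb) might be obtained by simple averaging, as in the hypothetical sketch below.

    import numpy as np

    # Hypothetical averaging of learned reference coefficients W_kjb over the
    # registered objects k, yielding one standard coefficient W_0jb per pose j
    # and variation b.
    def average_standard_weights(reference_weights):
        sums, counts = {}, {}
        for (k, j, b), W in reference_weights.items():
            sums[(j, b)] = sums.get((j, b), 0) + W.astype(float)
            counts[(j, b)] = counts.get((j, b), 0) + 1
        return {key: sums[key] / counts[key] for key in sums}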

The estimation of a pose with respect to a model C₁ will be described below. It is assumed that an input image I(r) as shown in FIG. 28 is obtained by image input unit 10 (step 100 shown in FIG. 49).

Then, normalizer 15 aligns the input image using feature points extracted from the object, and generates a normalized image I′(r) (step 101). Variation estimator 36 determines a correspondence between the normalized image and an area of a three-dimensional object model, using the pose information e_(j) of the reference images R_(1j) obtained from pose-specific reference image storage unit 85 and the standard three-dimensional object models obtained from standard three-dimensional object model storage unit 56, estimates a variation b based on image information of a given area, and sends the pose- and variation-specific standard weighting coefficient corresponding to the estimated variation, among the pose- and variation-specific standard weighting coefficients W_(0jb) stored in pose- and variation-specific standard weighting coefficient storage unit 98, to weighted matching and pose selecting unit 40 (step 181).

Illumination variations are estimated as follows: if the front illuminating direction (b=2), the front illuminating direction (b=2), and the right illuminating direction (b=1) are determined respectively with respect to the pose information e₁, e₂, e₃, then the pose- and variation-specific standard weighting coefficients W₀₁₂, W₀₂₂, W₀₃₁ are selected.

Finally, weighted matching and pose selecting unit 40 calculates weighted distance values D_(1j) between the normalized image I′(r) and the pose-specific reference images R_(1j), using the pose- and variation-specific standard weighting coefficients W_(0jb)(r) obtained from pose- and variation-specific standard weighting coefficient storage unit 98 (step 147), and selects a reference image (pose) whose distance value up to the object is the smallest, thereby estimating an optimum pose (step 155).

In the second and fourth embodiments of the present invention, a correspondence between the standard three-dimensional weighting coefficients and the reference three-dimensional object models is determined using the basic points, and two-dimensional weighting coefficients are generated. However, the correspondence may be calculated in advance, the standard three-dimensional weighting coefficients may be converted into reference three-dimensional weighting coefficients, and those reference three-dimensional weighting coefficients may be stored.
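A minimal sketch of that precomputation, assuming the coordinate correspondence obtained from the basic points is available as a function, is given below; nothing in it is prescribed by the embodiments.

    # Hypothetical off-line conversion of standard into reference coefficients.
    # 'standard_weights_3d' maps a texture coordinate Q0 of the standard model to
    # its weighting coefficient; 'correspondence(Q)' returns the standard
    # coordinate Q0 matched, via the basic points, to reference coordinate Q.
    def precompute_reference_weights(standard_weights_3d, correspondence,
                                     reference_coords):
        return {Q: standard_weights_3d[correspondence(Q)]
                for Q in reference_coords}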

In the fifth through eighth embodiments of the present invention, the pose-specific reference weighting coefficients, the pose-specific standard weighting coefficients, the pose- and variation-specific reference weighting coefficients, and the pose- and variation-specific standard weighting coefficients are learned with respect to each pose (and each variation), using pose-specific learning reference images. However, as with the first through fourth embodiments, reference three-dimensional object models may be used, errors between an input image and the reference images may be inversely converted into errors on the three-dimensional object models, and three-dimensional weighting coefficients may be learned and then converted depending on the pose, thereby generating the weighting coefficients. The reference three-dimensional object models may be generated from the learning reference images, or may be generated using a three-dimensional shape measuring apparatus. Learning reference images do not necessarily need to be prepared for each pose.
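The sketch below illustrates, under the assumption that each learning pose provides a mapping from model texture coordinates to image pixels, how pixel errors might be carried back onto a model and turned into three-dimensional weighting coefficients; the data layout is hypothetical.

    import numpy as np

    # Hypothetical learning of three-dimensional weighting coefficients. For each
    # learning image, 'projection' maps a texture-coordinate index Q of the model
    # to the pixel index r it produced under that pose; applying the map in
    # reverse carries pixel errors back onto the model (the inverse conversion
    # mentioned above).
    def learn_3d_weights(learning_images, comparative_images, projections,
                         n_coords, A=1.0):
        err_sum = np.zeros(n_coords)
        err_cnt = np.zeros(n_coords)
        for I, C, projection in zip(learning_images, comparative_images,
                                    projections):
            for Q, r in projection.items():
                err_sum[Q] += abs(float(I.flat[r]) - float(C.flat[r]))
                err_cnt[Q] += 1
        mean_err = err_sum / np.maximum(err_cnt, 1)
        # Weighting coefficient = normalization constant / mean error, matching
        # the reciprocal-of-average-error rule stated for the embodiments.
        return A / np.maximum(mean_err, 1e-6)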

In the third, fourth, seventh, and eighth embodiments, variations have been described as occurring in illuminating conditions. However, variations are not limited to illuminating conditions. If variations occur as object shape deformations (facial changes if the object is the face of a person), then the variations can be estimated using image information of a given area. For example, the opening and closing of an eye or of a mouth can be estimated by preparing image templates and matching the eye or the mouth against them, as sketched below. Alternatively, the estimation of variations may be omitted, and all the variation-specific three-dimensional weighting coefficients or pose- and variation-specific weighting coefficients may be used to perform weighted matching, with the variation whose distance value is the smallest (whose similarity degree is the greatest) being selected.
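A minimal sketch of such template matching, assuming normalized correlation as the matching measure and pre-cut patches of equal size, is as follows.

    import numpy as np

    # Hypothetical template classifier for a facial deformation such as an eye
    # or mouth opening and closing. 'region' is the patch cut out around the
    # part; 'templates' maps a variation label (e.g. 'open', 'closed') to an
    # equally sized template patch.
    def classify_by_templates(region, templates):
        def ncorr(a, b):
            a = a - a.mean()
            b = b - b.mean()
            return float(np.sum(a * b) /
                         (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
        region = region.astype(float)
        return max(templates,
                   key=lambda label: ncorr(region, templates[label].astype(float)))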

In each of the embodiments of the present invention, weighting coefficients are used and weighted distances are calculated for pose estimation. However, if matching is the purpose to be achieved, then distance calculations which do not use weighting coefficients may first be carried out to determine an optimum pose, after which the weighted distance may be calculated, as in the sketch below.
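Assuming the same weighted squared distance as above, the two-stage procedure might look like the following hypothetical sketch.

    import numpy as np

    # Hypothetical two-stage procedure: the optimum pose is determined with
    # plain, unweighted distances, and the weighted distance is then computed
    # once, for that pose only, as the matching score.
    def two_stage_matching(normalized_image, reference_images, weights):
        I = normalized_image.astype(float)
        plain = [np.sum((I - R.astype(float)) ** 2) for R in reference_images]
        j_best = int(np.argmin(plain))              # pose estimation, unweighted
        diff = I - reference_images[j_best].astype(float)
        return j_best, float(np.sum(weights[j_best] * diff ** 2))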

In each of the embodiments of the present invention, weighting coefficients are determined according to the reciprocal of an average of the errors of pixels between an input image (or a normalized image) and the reference images (or comparative images). However, weighting coefficients are not limited to being determined in this manner. Rather than an input image of the same object as the reference images, an input image of a different object may be used for learning. In this case, if the average error over a learning image of an object C_(k) is represented by E_(Q)^(k) and the average error over a learning image of another object by E_(Q)^(k−), then the reference three-dimensional weighting coefficients may be established according to V_(Q)^(k) = A′·E_(Q)^(k−)/E_(Q)^(k) (where A′ is a normalization coefficient), for example.
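Under the assumption that the per-coordinate average errors have already been computed as arrays, such discriminative weighting coefficients might be obtained as follows.

    import numpy as np

    # Hypothetical computation of V_(Q)^(k) = A'·E_(Q)^(k−)/E_(Q)^(k): points
    # that are stable for the object itself (small E_same) yet differ strongly
    # from other objects (large E_other) receive large weighting coefficients.
    def discriminative_weights(E_same, E_other, A_prime=1.0, eps=1e-6):
        return A_prime * E_other / np.maximum(E_same, eps)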

The functions of the means constituting the object pose estimating and matching system according to the present invention may be implemented in hardware, or may be performed by loading an object pose estimating and matching program (application) which performs the functions of the above means into a memory of a computer and controlling the computer. The object pose estimating and matching program is stored in a recording medium such as a magnetic disk, a semiconductor memory, or the like, and is loaded from the recording medium into the computer.

While the preferred embodiments of the present invention have been described above, the present invention is not limited to the above embodiments, but may be modified in various ways within the scope of the technical ideas thereof.

1. An object pose estimating and matching system comprising: reference three-dimensional object model storage means for storing, in advance, reference three-dimensional object models of objects; standard three-dimensional weighting coefficient storage means for storing, in advance, standard three-dimensional weighting coefficients; reference three-dimensional basic point storage means for storing, in advance, reference three-dimensional basic points corresponding to said reference three-dimensional object models; standard three-dimensional basic point storage means for storing, in advance, standard three-dimensional basic points corresponding to standard three-dimensional object models; pose candidate determining means for determining pose candidates for an object; comparative image generating means for generating comparative images close to an input image depending on said pose candidates, based on said reference three-dimensional object models; weighting coefficient converting means for determining a coordinate correspondence between said standard three-dimensional weighting coefficients and said reference three-dimensional object models, using said standard three-dimensional basic points and said reference three-dimensional basic points, and converting said standard three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on said pose candidates; and weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said input image and said comparative images, using said two-dimensional weighting coefficients, and selecting one of the comparative images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.
2. An object pose estimating and matching system comprising: reference three-dimensional object model storage means for storing, in advance, reference three-dimensional object models of objects; variation-specific standard three-dimensional weighting coefficient storage means for storing, in advance, standard three-dimensional weighting coefficients corresponding to image variations; reference three-dimensional basic point storage means for storing, in advance, reference three-dimensional basic points corresponding to said reference three-dimensional object models; standard three-dimensional basic point storage means for storing, in advance, standard three-dimensional basic points corresponding to standard three-dimensional object models; pose candidate determining means for determining pose candidates for an object; variation estimating means for determining a correspondence between an area of a three-dimensional object model and an input image, using said pose candidates and said reference three-dimensional object models, and estimating a variation based on image information of a given area of said input image; comparative image generating means for generating comparative images close to said input image depending on said pose candidates, based on said reference three-dimensional object models; weighting coefficient converting means for determining a coordinate correspondence between said standard three-dimensional weighting coefficients corresponding to the estimated variation and said reference three-dimensional object models, using said standard three-dimensional basic points and said reference three-dimensional basic points, and converting said standard three-dimensional weighting coefficients into two-dimensional weighting coefficients depending on said pose candidates; and weighted matching and pose selecting means for calculating weighted distance values or similarity degrees between said input image and said comparative images, using said two-dimensional weighting coefficients, and selecting one of the comparative images whose distance value up to said object is the smallest or whose similarity degree with respect to said object is the greatest, thereby to estimate and match the pose of said object.
3. An object pose estimating and matching system according to claim 1, further comprising: three-dimensional object model registering means for registering reference three-dimensional object models in said reference three-dimensional object model storage means; and three-dimensional basic point registering means for determining reference three-dimensional basic points with respect to said reference three-dimensional object models, and registering the determined reference three-dimensional basic points in said reference three-dimensional basic point storage means.